15 research outputs found

    Design of a diversity enforcement module for safety critical processing systems

    Safety-critical systems must adhere to specific functional safety standards describing the development process for those systems. One key requirement is the ability to prevent a single fault from causing a system failure, or in other words, to avoid Common Cause Failures (CCFs). Redundancy is a common solution against CCFs. However, some specific CCFs may affect redundant components identically (e.g., voltage droops, clock interferences), potentially leading to identical errors that may go unnoticed and cause a failure. Diversity is often deployed along with redundancy to avoid those CCFs as well. In the particular case of computing elements (e.g., cores), this is usually realized with some form of lockstep execution, where two identical cores execute the same software but with some time shift between them (aka staggering). Therefore, both cores have different state at any point in time, and faults affecting both cores lead to different errors, which can be detected by comparing the outputs. Unfortunately, existing solutions have some non-negligible costs: (i) hardware-only solutions hide half of the cores, making them non-user visible and hence halving platform performance even for non-critical tasks; conversely, (ii) software-only solutions are much more flexible but impose the use of a third core to run the lockstep monitor, and require large staggering, which has a significant impact on performance for short programs. This thesis devises a new solution that aims to combine the advantages of existing solutions. Our proposal, a hardware diversity-enforcement module (referred to as SafeDE), is an efficient hardware realization of the software monitor. Therefore, it does not hide any core from the end user, it does not require a third core for monitoring purposes, and it allows operating with tiny staggering (e.g., a few tens of cycles instead of the hundreds of thousands required by the software-only solution). We implement and integrate SafeDE in a space multicore prototype in an FPGA and validate that it effectively achieves its requirements with negligible hardware costs. Moreover, this work has already led to the publication of two peer-reviewed articles in specialized conferences and journals.
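The staggering idea described in this abstract can be illustrated with a toy cycle-level simulation. This is a minimal sketch under stated assumptions, not SafeDE's actual logic: the function name `run_staggered`, the stall probabilities, and the instruction-count-based gap check are all hypothetical choices made for illustration. A monitor lets the trailing core retire an instruction only if the gap to the head core stays at or above a minimum stagger.

```python
import random


def run_staggered(n_instr, min_stagger, seed=0):
    """Toy model: two cores run the same n_instr-instruction program.

    Returns (smallest head/trail gap seen while both cores were running,
    number of cycles in which the monitor held the trailing core back).
    All parameters and probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    head = trail = 0          # instructions retired by each core
    min_gap_seen = None
    monitor_stalls = 0

    while trail < n_instr:
        # The head core retires an instruction unless it stalls on its own.
        if head < n_instr and rng.random() < 0.8:
            head += 1

        head_done = head >= n_instr
        # Monitor rule: the trailing core may advance only if the gap
        # remains >= min_stagger afterwards (or the head has finished).
        if head_done or head - trail > min_stagger:
            if rng.random() < 0.9:   # the trailing core's own stalls
                trail += 1
        else:
            monitor_stalls += 1

        # Record the gap only while both cores are actually executing.
        if not head_done and trail > 0:
            gap = head - trail
            if min_gap_seen is None or gap < min_gap_seen:
                min_gap_seen = gap

    return min_gap_seen, monitor_stalls


if __name__ == "__main__":
    gap, stalls = run_staggered(n_instr=1000, min_stagger=20)
    print(f"smallest gap: {gap}, cycles stalled by monitor: {stalls}")
```

In this model the trailing core is free to catch up once the head core has finished; a real design would define its own behaviour at program boundaries.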

    Desarrollo de red de sensores IoT de bajo coste y bajo consumo (Development of a low-cost, low-power IoT sensor network)

    The goal of this work is to develop a network of battery-powered IoT (Internet of Things) temperature sensor nodes. The sensor nodes must be small and consume very little power. They must also connect to the Internet and store the temperatures recorded by all the sensors in the network, so that the user can later view the measurements. To address this problem, a central node was designed that communicates over Wi-Fi both with the local network's router and with the other sensor nodes in the network. Temperatures are stored both in the memory of the central node's Wi-Fi module (ESP-12E) and on a server provided by the thinger.io platform, a company that provides IoT services. The temperatures stored in the cloud can be viewed with the tools that the thinger.io platform provides. Viewing the temperatures stored in the central node requires a browser connected to the same local network as the sensor network. Achieving this required developing the firmware for the Wi-Fi modules. This firmware manages the Wi-Fi connections; the acquisition, transmission, and reception of temperatures; the handling of files and text files; the retrieval of the date and time over the Internet; and the connection to the thinger.io server. A web page was also developed to display, from the browser, the temperatures stored in the central node. On the hardware side, the temperature sensing circuit, based on an NTC thermistor, was designed, along with the battery charging circuit and the power supply circuits for the sensor and the Wi-Fi module. From these designs, three prototypes were built: the central node, powered directly from the mains, and two further sensors powered by batteries. These prototypes were verified to work correctly. The hardware design aimed to keep cost and power consumption as low as possible. Although the prototypes were built with conventional components, a PCB with SMD components was also designed to try to reduce the final size of the design.

    SafeDE: A low-cost hardware solution to enforce diverse redundancy in multicores

    Failure risk must be tiny in high-integrity systems, such as those in cars, satellites and aircraft. Hence, safety measures must be deployed to prevent a single fault from leading to a failure. Redundancy has often been used to address this concern, but it has proven insufficient if a single fault can cause the same error in all redundant elements, which defeats the purpose of redundancy for error detection. To avoid this scenario, diversity is implemented along with redundancy, with lockstep execution being the most popular diverse redundancy solution for computing cores. However, classic lockstep solutions have non-negligible limitations whether implemented in hardware (e.g., half of the cores can only be used for redundant execution and are not even visible at user level) or in software (e.g., the software loop to enforce staggering is long and costs performance). This paper tackles the limitations of classic lockstep solutions by providing an extended analysis and evaluation of SafeDE, a Diversity Enforcement hardware module that combines the short diversity-enforcement loop of hardware solutions with the nonintrusiveness of software solutions. Hence, cores can operate in lockstep mode efficiently or run independent tasks. In this paper, we present SafeDE and its rationale, its application to N-modular systems, its hardware and software integration, and an evaluation showing its performance and area efficiency and its behavior in the presence of faults. This work was supported in part by the European Union’s Horizon 2020 Research and Innovation Programme under Grant 871467, and in part by the Spanish Ministry of Science and Innovation under Grant PID2019-107255GB-C21/AEI/10.13039/501100011033. Peer Reviewed. Postprint (author's final draft).

    SafeDM: a hardware diversity monitor for redundant execution on non-lockstepped cores

    Computing systems in the safety domain, such as those in avionics or space, require specific safety measures related to the criticality of the deployment. A problem these systems face is that of transient failures in hardware. A solution commonly used to tackle potential failures is to introduce redundancy in these systems, for example two cores that execute the same program at the same time. However, redundancy does not cover all potential failures, such as Common Cause Failures (CCFs), where a single fault affects both cores identically (e.g. a voltage droop). If both redundant cores have identical state when the fault occurs, then there may be a CCF, since the fault can affect both cores in the same way. To avoid CCFs it is critical to know that there is diversity in the execution among the redundant cores. In this paper we introduce SafeDM, a hardware Diversity Monitor that quantifies the diversity of each redundant processor to guarantee that CCFs will not go unnoticed, without needing to deploy lockstepped cores. SafeDM computes data and instruction diversity separately, using different techniques appropriate for each case. We integrate SafeDM in a RISC-V FPGA space MPSoC from Cobham Gaisler, where SafeDM is proven effective with a large benchmark suite while incurring low area and power overheads. Overall, SafeDM is an effective hardware solution to quantify diversity in cores performing redundant execution. Funded by EU’s Horizon 2020 grant no. 871467 and Spanish MSI grant PID2019-107255GB-C21/AEI/10.13039/501100011033. Peer Reviewed. Postprint (author's final draft).

    Unboxing the sand: on deploying safety measures in the programmable logic of COTS MPSoCs

    The lack of sufficient hardware support for functional safety precludes the full adoption of many Commercial Off-the-Shelf (COTS) MPSoCs in safety-related systems, such as those in the aerospace industry. Some recent MPSoCs come with programmable logic (PL), primarily intended to offload specific complex functions that can be implemented much more efficiently in hardware than in software, which makes the PL a kind of sandbox fully controlled by the ASIC cores outside it. This paper proposes using the PL in those COTS MPSoCs to deploy the support needed to implement safety measures efficiently, enabling the use of those MPSoCs in systems needing high assurance levels. Hence, the goal is not only to control the PL from the cores, but also to allow the PL to provide monitoring (e.g. contention, diversity, watchdogs) and control (e.g. configuring QoS features) capabilities that enable the realization of a safety concept on top. The early work presented in this paper already provides specific monitoring, diversity, and control strategies that allow the PL to take over safety-related functionalities. This work is part of the project PCI2020-112010, funded by MCIN/AEI/10.13039/501100011033 and the European Union “NextGenerationEU”/PRTR, and the European Union’s Horizon 2020 Programme under project ECSEL Joint Undertaking (JU) under grant agreement No 877056. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GBC21 funded by MCIN/AEI/10.13039/501100011033. Peer Reviewed. Postprint (published version).

    SafeSoftDR: A library to enable software-based diverse redundancy for safety-critical tasks

    Applications with safety requirements have become ubiquitous and can be found in edge devices of all kinds. However, microcontrollers in those devices, despite offering moderate performance by implementing multicores and cache hierarchies, may fail to offer adequate support for some safety measures needed at the highest integrity levels, such as lockstepped execution to avoid so-called common cause failures (i.e., a fault affecting redundant components and causing the same error in all of them). To respond to this limitation, an approach based on a software monitor enforcing a form of software-based lockstepped execution across cores has been proposed recently in [2], providing a proof of concept. This paper presents SafeSoftDR, a library providing a standard interface to deploy software-based lockstepped execution across non-natively lockstepped cores, relieving end-users of the burden of creating redundant processes, copying input/output data, and performing result comparison. Our library has been tested on x86-based Linux and is currently being integrated on top of an open-source RISC-V platform targeting safety-related applications, hence offering a convenient environment for safety-critical applications. This work is part of the project PCI2020-112010, funded by MCIN/AEI/10.13039/501100011033 and the European Union “NextGenerationEU”/PRTR, and the European Union’s Horizon 2020 Programme under project ECSEL Joint Undertaking (JU) under grant agreement No 877056. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB-C21 funded by MCIN/AEI/10.13039/501100011033. Peer Reviewed. Postprint (published version).
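The three chores this abstract says the library takes over (creating redundant executions, copying input data, and comparing results) can be sketched as follows. This is an illustrative assumption, not the SafeSoftDR interface: `run_redundant` and `RedundancyMismatch` are hypothetical names, and for simplicity the two replicas run one after the other in the same process rather than as staggered processes on separate cores.

```python
import copy


class RedundancyMismatch(Exception):
    """Raised when the two redundant runs produce different outputs."""


def run_redundant(task, inputs):
    """Run `task` twice on private copies of `inputs` and compare results.

    Hypothetical helper for illustration only; a real library would place
    the replicas on different cores and enforce staggering between them.
    """
    # Each replica receives its own deep copy, so corruption of one
    # replica's input data cannot silently reach the other.
    out_a = task(copy.deepcopy(inputs))
    out_b = task(copy.deepcopy(inputs))

    # Any disagreement is reported as a detected error.
    if out_a != out_b:
        raise RedundancyMismatch(f"replica outputs differ: {out_a!r} vs {out_b!r}")
    return out_a


if __name__ == "__main__":
    print(run_redundant(sum, [1, 2, 3]))
```

A caller would typically catch `RedundancyMismatch` and move the system to a safe state or retry the task.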

    SafeSU-2: a safe statistics unit for space MPSoCs

    Advanced statistics units (SUs) have been proven effective for the verification, validation and implementation of safety measures as part of safety-related MPSoCs. This is the case, for instance, of the RISC-V MPSoC by CAES Gaisler based on NOEL-V cores that will become commercially ready on FPGAs by the end of 2022. However, while those SUs support safety in the rest of the SoC, they must themselves be built to be safe in order to be part of commercial products. This paper presents the SafeSU-2, the safety-compliant version of the SafeSU. In particular, we perform a Failure Mode and Effect Analysis (FMEA) for the SafeSU for relevant fault models, and implement the fault detection and tolerance features needed to make it compliant with the requirements of safety-related devices in general, and of space MPSoCs in particular. This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 871467. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GBC21/AEI/10.13039/501100011033. Peer Reviewed. Postprint (author's final draft).

    SafeX: Open source hardware and software components for safety-critical systems

    The RISC-V Instruction Set Architecture (ISA) has emerged as an opportunity to develop open source hardware without being subject to expensive licenses or export restrictions. A plethora of initiatives are developing systems-on-chip (SoCs) and their components based on RISC-V, targeting a wide variety of markets. However, domains with safety requirements, such as avionics, space, and automotive, require SoCs to include support for meeting those requirements. This work introduces the SafeX family of components, a set of components providing SoC controllability, observability and support for safety measures. These components, developed by the Barcelona Supercomputing Center under permissive open source licenses, are intended to be the basis for making SoCs meet the needs of domains with safety requirements. In particular, the SafeX components developed so far include the SafeSU (multicore statistics unit), the SafeTI (flexible and programmable traffic injector), the SafeDE and SafeSoftDR (hardware and software modules to enforce lockstep execution), and the SafeDM (module to monitor diversity across cores). This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 871467. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB-C21 funded by MCIN/AEI/10.13039/501100011033. Peer Reviewed. Postprint (author's final draft).

    End-to-end QoS for the open source safety-relevant RISC-V SELENE platform

    This paper presents the end-to-end QoS approach followed in the SELENE platform, a high-performance RISC-V based heterogeneous SoC for safety-related real-time systems, to provide performance guarantees. Our QoS approach includes smart interconnect solutions for buses and NoCs, along with multicore interference-aware statistics units that cooperate to achieve end-to-end QoS. This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 871467. BSC work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB-C21/AEI/10.13039/501100011033. Peer Reviewed. Postprint (published version).

    De-RISC: A complete RISC-V based space-grade platform

    The H2020 EIC-FTI De-RISC project develops a RISC-V space-grade platform to jointly respond to several emerging, as well as longstanding, needs in the space domain, such as: (1) higher performance than that of monocore and basic multicore space-grade processors on the market; (2) access to an increasingly rich software ecosystem rather than sticking to the slowly fading SPARC and PowerPC-based ones; (3) freedom from (or drastic reduction of) the export and license restrictions imposed by commercial ISAs such as Arm; (4) improved support for the design and validation of safety-related real-time applications; and (5) a platform whose software is qualified and whose hardware is designed per established space industry standards. De-RISC partners set up the different layers of the platform during the first phases of the project and have recently boosted integration and assessment activities. This paper introduces the De-RISC space platform and presents recent progress, such as enabling virtualization and software qualification, new MPSoC features, and use case deployment and evaluation, including a comparison against other commercial platforms. Finally, this paper introduces the ongoing activities that will lead to the hardware and fully qualified software platform at TRL8 on FPGA by September 2022. This project has received funding from the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement EIC-FTI 869945. BSC work has also been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-07255GBC21/AEI/10.13039/501100011033. Peer Reviewed. Postprint (author's final draft).